1. INTRODUCTION

Mathematics  has  throughout  the  history  of  science 
served  as  both  the  queen  and  handmaiden  of  other  disciplines 
in  providing  examples,  stimulation  and  working  tools  for 
them. Computer Science has joined Mathematics in this role
almost  since  computers  appeared  on  the  scene. 

We describe a new offspring of established areas of
study,  which  we  tentatively  call  Theory  of  Strategies.  It 
is  characterized  by  the  problems  it  aims  at  solving  and  by 
the  methodology  it  would  use. 

At  the  start,  it  should  be  noted  that  the  distinction 
between 'strategic' and 'tactical' is rather moot and varies
from context to context. We shall refer to strategic
considerations when their consequences remain relevant to
the outcome of a confrontation throughout the conflict. A
strategy is, of course, more than the sum of the
participating tactics. It also includes the means of
evaluating the adversary's situation and actions, scheduling
of one's own tactics, and making use of feedback from the
environment in modifying the rules of tactics both in terms
of  their  contents  and  their  inter-relations.  In  short, 
strategy  gives  tactics  its  mission  and  seeks  to  reap  its 
results. 

People  use  the  word  strategy  in  a  variety  of  contexts. 
Although its original meaning ("the art of the general" in
ancient Greek) refers to the conduct of warfare, the term
has since assumed connotations ranging from statesmanship
and management of national policy to diplomacy and economic
planning, chiefly after the theoretical works by Karl von
Clausewitz [1] and Antoine-Henri de Jomini [2]. After John
von Neumann and Oskar Morgenstern [3] showed the similarity
between the gamelike problems in economics, sociology,
psychology and politics, the concept of strategy became
pervasive also in social sciences. We talk about 'problem
solving strategies' or the 'corporate strategy' in a large
business enterprise, etc., whenever a sequence of
goal-oriented  actions  is  based  on  large-scale  and  long-range 
planning. 

We shall adopt the latter type of interpretation of
strategy  and  investigate  how  Computer  Science  can  contribute 
to  strategic  planning. 

2. THE OBJECTIVES OF A THEORY OF STRATEGIES

The  following  is  a  list  of  some  of  the  objectives  of 
the  proposed  theory: 

• to identify adequate computer representations of
static and learning strategies, which representations can
then be effectively and efficiently employed both in a
simulated world and in direct interaction with the real
world;

• to develop techniques which analyze strategies,
measure the performance of the whole strategy, and of its
appropriately distinguished components ("credit
assignment"), under most or all relevant conditions;

• to observe strategies in action, either in a sequence
of unperturbed confrontations with others or under
"laboratory conditions" when the environment is specified
according to some experimental design, in order to generate
a computer model (a "descriptive theory") of it;

• to combine the best components of several strategies,
eliminate  the  redundancies  and  inconsistencies  among  these 
components  and  produce  a  strategy  that  is  normative  in  the 
statistical  sense; 

• to establish stochastic, causal relationships between
open  variables  that  can  be  measured  at  any  time  and  hidden 
variables  whose  values  can  be  identified  only  intermittently 
or  periodically,  in  order  to  find  out  the  actions  of  a 
strategy,  and  their  underlying  reasons  and  consequences; 

• to create a system that can be taught strategies via
principles  and  high-level  examples;  the  system  should  be 
able  to  make  inquiries  about  vague,  incomplete  or 
contradictory  advice,  and  to  apply,  evaluate  and  improve  the 
strategy  so  acquired. 


3. SOME METHODOLOGICAL ISSUES


The  mathematical  theory  of  games  has  given  us  a 
conceptual  framework,  a  useful  terminology  but  few  practical 
methods  to  solve  large-scale,  complex,  real-life  problems. 
On the other hand, areas of study such as decision theory,
utility theory and operations research could make important
contributions to the technique we consider essential in
approaching  the  above  objectives. 

We propose the term 'Digital Gaming' (DG) for the
computing activity that incorporates model-building,
simulation and learning programs. A programming system
dedicated to DG, collaborating with human decision-makers,
would eventually assume the role of a Command-and-Control
unit. 

The  idea  of  attacking  the  problem  of  strategic  planning 
with  the  techniques  of  Computer  Science  has  several 
"by-products"  that  we  note  here.  As  any  person  in  computing 
would  tell,  when  one  has  to  formulate  a  problem  to  program 
it,  all  "intangibles"  must  be  described  so  as  to  be  amenable 
to  algorithmic  or  heuristic  treatment.  Such  description 
also  clarifies  the  thought  processes  of  the  experts  whose 
advice  and  experience  are  sought  in  establishing  the 
programming  system.  Thus  even  the  existing  techniques  are 
bound  to  improve. 

More  importantly,  the  designer  of  a  system,  working  in 
the  top-down  mode,  assumes  the  existence  of  modules  below 
the  one  he  is  concerned  with.  He  establishes  a  flow  of 
control  and  information  among  subsystems  which  will  be 
implemented  later  or,  possibly,  are  to  be  operated  by  human 
beings  for  some  time  to  come.  This  idea  should  encourage 
continual expansion of the domain already automated.

4. ON DIGITAL GAMING

As  noted  before,  DG  is  more  than  running  simulation 
models. We believe in the utility of machine learning,
which  has  been  in  the  focus  of  our  interest  in  studying 
decision-making  under  uncertainty  and  risk  [4-8].  Learning 
programs  would  assume  an  important  role  in  DG  [11].  They 
would  continually  improve  the  performance  of  the  system 
whenever  (i)  better  responses  are  attainable  under  constant 
environmental conditions, (ii) the physical environment or
the  adversary  strategy  changes.  In  order  to  illustrate 
their  relevance,  we  describe  briefly  three  types  of  learning 
processes  (out  of  some  two  dozen)  and  a  high-level 
strategy-acquisition  technique  that  we  have  been  working  on 
over  the  past  several  years. 

5. THE "BAYESIAN" LEARNING MODE

"Bayesian"  learning  processes  make  inductive 
inferences,  that  is,  draw  general  conclusions  from  specific 
events. (The name refers to Bayes' theorem in probability,
which assumes the a priori knowledge of certain conditional
probabilities of certain events occurring after some other
events.) They modify the decision-making rules by comparing
predicted outcomes of events and actual outcomes. There are
basically three ways to adjust the rules to bring the actual
outcomes closer to the expected ones. If a number of
parameters are included in the pre-established heuristic
rules,  a  learning  process  can  make  their  values  converge  to 
near-optimum  values.  Or  an  optimum  hierarchical  ordering 
for  the  heuristic  rules  can  be  found  experimentally.  It  is 
also  possible  to  generate  automatically  new  heuristic  rules, 
test  them  and  incorporate  the  successful  ones  into  the  new 
strategy — a  usually  difficult  and  time-consuming  process. 
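
The first of these three adjustments, tuning the parameters of pre-established rules, can be illustrated with a minimal sketch. It substitutes a plain error-correction (delta) rule for whatever update the learning programs actually use, and it assumes that a heuristic rule predicts an outcome as a weighted sum of numeric features. The function names, the learning rate and the sample observations are all hypothetical.

```python
from typing import List, Sequence, Tuple

Observation = Tuple[Sequence[float], float]  # (features, actual outcome)


def predict(weights: Sequence[float], features: Sequence[float]) -> float:
    """Predicted outcome of a heuristic rule: a weighted sum of features."""
    return sum(w * x for w, x in zip(weights, features))


def tune_parameters(weights: Sequence[float],
                    observations: Sequence[Observation],
                    rate: float = 0.1,
                    passes: int = 200) -> List[float]:
    """Move each parameter a small step toward agreement with actual outcomes."""
    tuned = list(weights)
    for _ in range(passes):
        for features, actual in observations:
            error = actual - predict(tuned, features)
            for i, x in enumerate(features):
                tuned[i] += rate * error * x
    return tuned


# Hypothetical observations, consistent with the parameter values (2, -1).
observations = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
learned = tune_parameters([0.0, 0.0], observations)
```

Each pass compares predicted and actual outcomes and shifts the parameters in the direction that reduces the discrepancy. With the consistent observations above, the parameters approach 2 and -1.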

It  should  be  noted  that,  in  accordance  with  the 
experimental  spirit  of  DG,  a  variety  of  "Bayesian"  learning 
processes  must  be  tried,  which  vary  in  the  type  and  amount 
of  information  they  collect  and  in  how  they  use  it. 

6. THE QUASI-OPTIMIZER (QO) SYSTEM

Let  us  consider  an  environment  in  which  either  several 
organizations are competing to achieve an identical,
mutually conflicting goal, or else a set of alternative
strategies exist, each trying to win against an identical,
opposing strategy [9,13]. (One can assume, for the sake of
generality,  that  a  goal  vector  is  specified  whose  components 
need  not  be  independent  in  real-life  confrontations;  for 
example,  in  air  battle  management,  the  ratio  of  targets 
accessed  and  enemy  air  defense  units  suppressed  are 
obviously  inter-related  goal  components.) 

Each of the strategies evaluates the environment by
measuring certain variables (numerical or symbolic)
available to it, which the strategy considers relevant.
Such variables may be the real or assumed actions of the
adversary, the perceived state of the confrontation,
availability and capabilities of friendly forces, threat
estimates, criticality and vulnerability of the adversary's
and our resources, etc. An important component of a
strategy is aimed at interpreting these measurements and
incorporating them in the process of making decisions that
can lead to goal-achievement (and to the exclusion of
goal-achievement by the adversary).

The  environment  as  perceived  by  the  strategy  is  unclear 
because some information may be unavailable, missing (risky
or uncertain, according to whether or not the relevant a
priori probability distributions are known, respectively) or
obscured  by  noise  (caused  accidentally  or  by  deliberate 
obfuscation).  If  the  decisions  based  on  such  incomplete 
and/or  inconsistent  information  are  less  sound  than  those  of 
the  adversary,  resources  will  be  wasted  and  goal  achievement 
will  be  farther  removed. 

Let  us  now  consider  how  we  could  generate  a  new 
strategy.  The  system  has  to  generate  automatically  a  model 
(a descriptive theory) of every participating strategy
through observation and measurements. It would then have to
assign to each component of the models some measure of
quality; that is, an outcome-dependent allocation of credit
must be made. The strategy obtainable from the best components of the
model strategies is a normative theory which is potentially
the best of all available ones, on the basis of the
information accessible by us. This normative strategy is in
fact only quasi-optimum for four reasons. First, the
resulting strategy is optimum only against the original set
of strategies considered. Another set may well employ
controllers and indicators for decision-making that are
superior to any in the "training" set. Second, the strategy
is normative only in the statistical sense. Fluctuations in
the adversary strategy, whether accidental or deliberate,
impair the performance of the QO strategy. Third, the
adversary strategy may change over time and some aspects of
its dynamic behavior may necessitate a change in the QO
strategy. Finally, the generation of both the descriptive
theories  (models)  and  of  the  normative  theory  (the  QO 
theory)  is  based  on  approximate  and  fallible  measurements. 
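
The selection and merging of components described above can be sketched as follows. This is an illustration under simplifying assumptions, not the QO system itself: a strategy component is reduced to a (situation, action) pair carrying a single quality score, and every name, label and number is hypothetical.

```python
from typing import Dict, List, NamedTuple


class Component(NamedTuple):
    source: str      # which observed strategy the component came from
    situation: str   # hypothetical label for a class of situations
    action: str
    quality: float   # credit assigned from observed outcomes, in [0, 1]


def super_strategy(components: List[Component], threshold: float) -> List[Component]:
    """Keep only components whose quality lies above the threshold."""
    return [c for c in components if c.quality > threshold]


def quasi_optimum(components: List[Component]) -> Dict[str, str]:
    """Resolve duplicates and conflicts: one best action per situation."""
    best: Dict[str, Component] = {}
    for c in components:
        kept = best.get(c.situation)
        if kept is None or c.quality > kept.quality:
            best[c.situation] = c
    return {situation: c.action for situation, c in best.items()}


def uncovered(strategy: Dict[str, str], situations: List[str]) -> List[str]:
    """Completeness check: situations for which no action is prescribed."""
    return [s for s in situations if s not in strategy]


components = [
    Component("S1", "outnumbered", "withdraw", 0.8),
    Component("S2", "outnumbered", "attack", 0.3),   # below threshold
    Component("S2", "even", "probe", 0.7),
    Component("S3", "even", "probe", 0.6),           # redundant
    Component("S3", "superior", "attack", 0.9),
    Component("S1", "superior", "hold", 0.6),        # inconsistent with S3
]
selected = super_strategy(components, threshold=0.5)
qo = quasi_optimum(selected)
gaps = uncovered(qo, ["outnumbered", "even", "superior", "unknown"])
```

The result is the best only relative to the strategies that supplied the components, which is the first of the four reasons given above for calling it quasi-optimum.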

The  system  under  development  employs  the  following 
modules:

6.1 The QO-1 assumes a monotonic strategy response
surface  and  uses  either  exhaustive  search  or  binary  chopping 
to  construct  a  descriptive  theory  of  static  (non-learning) 
strategies. 

6.2  The  QO-2  extrapolates  a  finite  sequence  of  learning 
trees,  each  representing  the  same  strategy  at  different 
stages  of  development,  and  computes  their  asymptotic  form. 
The latter will then be used in constructing the normative
theory.

6.3 The QO-3 minimizes the total number of experiments
QO-1 has to perform. It no longer assumes that the strategy
response surface is monotonic and will eventually also deal
with multi-dimensional responses. QO-3 starts with a
balanced incomplete block design for experiments and
computes dynamically the specifications for each subsequent
experiment. In other words, the levels of the decision
variables in any single experiment and the length of the
sequence of experiments depend on the responses obtained in
previous experiments.

6.4 The QO-4 performs the credit assignment. That is,
it identifies the components of a strategy and assigns to
each a quality measure of the 'outcomes'. An outcome need
not be only the immediate result of a sequence of actions
prescribed by the strategy but can also involve long-range
consequences  of  planned  actions. 

6.5 The QO-5 constructs a 'Super Strategy' by combining
strategy  components  associated  with  outcomes  of  a  quality 
above  a  threshold  value. 

6.6 The QO-6 generates a Quasi-Optimum strategy from
the  Super  Strategy  by  eliminating  inconsistencies  and 
redundancies  from  the  latter.  It  also  tests  and  verifies 
the  QO  strategy  for  completeness. 
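
The binary chopping mentioned under 6.1 can be illustrated with a minimal sketch. It assumes the simplest possible case: a static strategy treated as a black box that gives a yes/no response to one numeric decision variable, with a monotonic response that flips exactly once. The function names and the example strategy are hypothetical.

```python
from typing import Callable


def find_switch_point(strategy: Callable[[float], bool],
                      low: float,
                      high: float,
                      tolerance: float = 1e-3) -> float:
    """Locate where a monotonic yes/no response flips between low and high."""
    low_response = strategy(low)
    if low_response == strategy(high):
        raise ValueError("the response does not change on this interval")
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if strategy(mid) == low_response:
            low = mid    # the flip lies in the upper half
        else:
            high = mid   # the flip lies in the lower half
    return (low + high) / 2.0


def observed_strategy(threat_level: float) -> bool:
    """Hypothetical black box: attacks once the threat level reaches 0.37."""
    return threat_level >= 0.37


estimate = find_switch_point(observed_strategy, 0.0, 1.0)
```

Each experiment halves the interval, so the number of experiments grows only with the logarithm of the required precision, whereas exhaustive search grows linearly with it.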

7. THE ADVICE TAKER/INQUIRER SYSTEM (AT/I)

The objective of this system is to establish a
man-machine environment in which a human advisor can teach
strategies of confrontation on-line, through principles and
high-level examples. The principles and examples normally
consist of situations and recommended actions. (Principles
describe rather general situations defined in a flexible
manner whereas examples are specific and illustrate
appropriate behavior in a general situation by analogy with
a particular one. Actions can either adhere to some general
guidelines or follow sharply defined
prescriptions.) Whenever the system finds the advice given
to be vague, incomplete or inconsistent with previously
imparted knowledge, it makes inquiries and asks for
clarification. The advisor can define and re-define the
components of a principle at any time. He can also
over-ride temporarily the strategy taught so far by issuing
commands.

The system does not start out with a blank memory. It
knows the rules governing the confrontation, the variables,
and  the  ranges  of  their  values  within  the  situation  space. 
The  advisor  can  at  any  time 

(i) define variables, functions, general and
specific actions, confrontation-related adjectives,
nouns and verbs, in terms of constants, confrontation
parameters, current values, overall and moving averages
of statistical values, basic confrontation actions, and
Boolean and relational operators;

(ii) define principles of a strategy which connect
a situation (specified as a Boolean combination of
ranges of statistical variables, again current values,
overall or moving averages) to some general or specific
action;

(iii) give high-level examples by connecting
sharply specified situations to direct confrontation
actions;

(iv)  make  inquiries  about  definitions,  principles, 
and  values  of  statistical  variables  stored  so  far; 

(v)  issue  an  order  which  temporarily  over-rides 
the  strategy  acquired  so  far. 

In  turn,  the  system  can 

(a) ask for clarification whenever new definitions
are vague or conflict with stored ones, or the strategy
is incomplete in not covering the whole confrontation
space;

(b) return exemplary actions in user-specified
confrontation  situations,  in  accordance  with  the 
strategy  acquired; 

(c)  display  definitions,  principles,  confrontation 
parameters,  values  of  variables,  etc. 
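
The exchange outlined in the two lists above can be sketched in miniature. The sketch assumes a much reduced representation: a principle is a conjunction of ranges over named variables (the text allows arbitrary Boolean combinations), and the only inquiry generated is about situations that no principle covers. The class, function and variable names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Range = Tuple[float, float]  # (low, high]: low is excluded, high is included


@dataclass
class Principle:
    name: str
    conditions: Dict[str, Range]  # every listed variable must fall in its range
    action: str

    def applies(self, situation: Dict[str, float]) -> bool:
        return all(low < situation[var] <= high
                   for var, (low, high) in self.conditions.items())


def recommend(principles: List[Principle],
              situation: Dict[str, float]) -> Optional[str]:
    """Return the action of the first principle that covers the situation."""
    for principle in principles:
        if principle.applies(situation):
            return principle.action
    return None


def inquiries(principles: List[Principle],
              probes: List[Dict[str, float]]) -> List[str]:
    """Questions for the advisor about situations the strategy does not cover."""
    return [f"No principle covers {probe}; what action is appropriate?"
            for probe in probes if recommend(principles, probe) is None]


principles = [
    Principle("press advantage",
              {"air_superiority_ratio": (2.0, math.inf)}, "seek air battles"),
    Principle("hold",
              {"air_superiority_ratio": (1.0, 2.0)}, "assume a holding position"),
]
questions = inquiries(principles, [{"air_superiority_ratio": 3.0},
                                   {"air_superiority_ratio": 0.5}])
```

Here the probe with a ratio of 0.5 falls outside both principles, so one question is returned to the advisor.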

Random number generators also have a role in defining
game-theoretically mixed strategies. A sense of time has
also to be incorporated in the "tool kit" of definitions,
whether it refers to continuous or quantized time or to a
counter of certain specific events.
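
As a small illustration of the role of random numbers, a mixed strategy can be represented as a probability distribution over actions from which one action is drawn at each decision point. The action names and probabilities below are hypothetical, and the fixed seed is used only to make runs repeatable.

```python
import random
from typing import Dict


def choose_action(mixed_strategy: Dict[str, float], rng: random.Random) -> str:
    """Draw one action with probability proportional to its weight."""
    actions = list(mixed_strategy)
    weights = [mixed_strategy[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


# Hypothetical mixed strategy: attack 70% of the time, hold 30%.
mixed = {"attack": 0.7, "hold": 0.3}
rng = random.Random(0)
draws = [choose_action(mixed, rng) for _ in range(10_000)]
attack_share = draws.count("attack") / len(draws)
```

Over many draws the observed share of each action settles close to its assigned probability.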

We note two important facilities to be used in
specifying principles. Let us call these Advisor-Assigned
and Advisor-Defined Adversary Types (AAAT and ADAT,
respectively). In the former case, the advisor assigns a
certain adversary to one or more categories (Adversary
Types) named by him. In the latter case, the advisor
defines one or several categories by Boolean combinations of
ranges of statistical variables, which are regularly or
continually collected over the adversary's actions. (The
variables can refer to current values, or overall or moving
averages.) At prescribed intervals, the system compares the
adversary behavior with the specifications of all ADATs.
Accordingly, each adversary (at that time) may belong to
various Advisor-Defined Adversary Types. Thus the principle
describing the appropriate action can refer to all such
adversaries  that  satisfy  the  definition  conditions  of  the 
Adversary  Type  at  hand. 
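
The periodic comparison of adversary behavior with the ADAT specifications can be sketched as follows. The sketch assumes that each type is a conjunction of inclusive ranges over named statistics and that the statistic of interest is a moving average. The type names, the statistic and the numbers are hypothetical.

```python
import math
from collections import deque
from typing import Deque, Dict, List, Tuple

TypeDefinition = Dict[str, Tuple[float, float]]  # statistic -> (low, high)


class MovingAverage:
    """Average of the most recent `window` observations."""

    def __init__(self, window: int) -> None:
        self.values: Deque[float] = deque(maxlen=window)

    def add(self, value: float) -> float:
        self.values.append(value)
        return sum(self.values) / len(self.values)


def matching_types(types: Dict[str, TypeDefinition],
                   stats: Dict[str, float]) -> List[str]:
    """Names of all adversary types whose ranges the statistics satisfy."""
    return sorted(name for name, ranges in types.items()
                  if all(low <= stats[var] <= high
                         for var, (low, high) in ranges.items()))


adat = {
    "aggressive": {"attack_rate": (4.0, math.inf)},
    "active": {"attack_rate": (3.0, 10.0)},
    "cautious": {"attack_rate": (0.0, 2.0)},
}
rate = MovingAverage(window=3)
for attacks_in_period in [1, 5, 6, 7]:
    current = rate.add(attacks_in_period)
memberships = matching_types(adat, {"attack_rate": current})
```

Because the ranges may overlap, one adversary can belong to several types at the same time, as noted above.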

Advisor-defined nouns can reasonably be required to be
unambiguous. However, adjectives (and, to some extent,
verbs) must often have different meanings when used to
modify different types of nouns (cf. a "strong attack" vs. a
"strong  concentration").  The  AT/I  system  has  to  distinguish 
(at  least)  four  different  classes  of  instances: 

(i) Patent: confrontation parameters, statistical
variables, AT/I's own resources (e.g., "If your air
superiority is more than 2:1, seek air battles.")

(ii) Interactive: the adversary's actions during
current confrontation (e.g., "If the adversary is
bringing up additional resources, assume a holding
position.")

(iii)  Statistical:  accumulated  data  about  the 
adversary's  past  behavior  (e.g.,  "If  the  adversary  is 
self-confident,  make  sudden  attacks.") 

(iv)  Inferential:  assumptions  about  the  intentions 
or events behind the adversary's behavior (e.g., "If
the enemy appears to have received additional supplies,
wait for confirmation.")

This  classification  is  neither  exhaustive  nor 
exclusive.  If  the  Definition  Manager,  a  part  of  the 
programming  system,  cannot  decide  unambiguously  on  the  class 
into  which  the  components  of  the  definition  fall,  it  has  to 
consult  the  human  advisor. 

Another  difficulty  rests  with  the  need  to  resolve  a 
situation-dependent  conflict  between  principles  of  global 
and momentary relevance. Furthermore, the system must be able
to generate disambiguating questions whenever the relative
importance of the principles, as specified by the advisor,
is inconsistent because of non-transitive preferences given
in the advice.
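
One way to detect such inconsistency is to treat the advisor's statements of the form "principle A outranks principle B" as arrows in a directed graph and to look for a cycle. The sketch below is an assumption about how this could be done, not a description of AT/I; the principle names are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple


def find_preference_cycle(preferences: List[Tuple[str, str]]) -> Optional[List[str]]:
    """Return a cycle such as [A, B, C, A] if the preferences contain one."""
    graph: Dict[str, List[str]] = {}
    for stronger, weaker in preferences:
        graph.setdefault(stronger, []).append(weaker)
        graph.setdefault(weaker, [])
    finished: Set[str] = set()

    def visit(node: str, path: List[str]) -> Optional[List[str]]:
        if node in path:                      # came back to a node on the path
            return path[path.index(node):] + [node]
        if node in finished:                  # already known to be cycle-free
            return None
        for nxt in graph[node]:
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        finished.add(node)
        return None

    for start in graph:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


def disambiguating_question(cycle: List[str]) -> str:
    return "Preferences form a cycle: " + " > ".join(cycle) + ". Which takes priority?"


advice = [("P1", "P2"), ("P2", "P3"), ("P3", "P1")]
cycle = find_preference_cycle(advice)
```

When a cycle is found, its members are exactly the principles the system would need to ask the advisor about.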

Finally,  we  note  that  to  teach  a  strategy  by  telling 
how to do things in general is more efficient and less
error-prone than to tell what to do in every relevant
situation. An AT/I-like system would have practical
usefulness in doing this. Human experts would specify, via
a high-level interaction with the machine, a number of
alternative strategies. Other components of the DG system,
such as a QO-like system, would then generate models of
uniform  structure  of  each  strategy.  A  prescriptive, 
quasi-optimum  strategy  would  then  finally  be  constructed 
from  these. 


The system under construction employs the following
modules:

7.1 The AT/I-1 constructs the framework for the flow of
information and control between the AT/I system and the
Advisor.

7.2 The AT/I-2 converts the principles and high-level
examples into a canonical form and stores them. Next it
embeds them into an initially skeleton strategy which then
becomes employable.

7.3 The AT/I-3 eliminates inconsistencies and
incompletenesses from the strategy acquired, in part by
interacting with the Advisor.

7.4 The AT/I-4 tests (verifies) and evaluates the
strategy constructed according to a metric which is
independent of any particular strategy.

8. THE GENERALIZED PRODUCTION RULES (GPR) SYSTEM


The underlying motivations for the actions prescribed
by a strategy, the actions themselves, and their
consequences are not necessarily observable and measurable
at any desired time. The values of such hidden variables
can be identified only at certain times, either
intermittently or periodically. At other times, their
values have to be estimated. In contrast, open
variables are readily measurable at any time. The
estimation is based on generalized production rules
expressing stochastic, causal relations between open and
hidden variables. Either can be cause or effect. The GPR
system is designed to provide decision support for expert
systems in need of numerical estimates of hidden variable
values.

A  knowledge  base  is  established  over  a  period  of 
measurements. It consists of an ordered set of generalized
production rules of the form

[...] behavior of the k-th open variable (OV). T_jn is the
difference in time (timelag) or in space (distance) between
the start of the j-th morph (in case of a trend) or its
occurrence (in case of a sudden change or step function),
and the point of time or space at which the n-th HV, H_n,
assumes its m-th value, V_m. This difference may be
positive (when the OV is the cause and thus precedes the HV,
the effect) or negative in the opposite case. The term
'lag' is used for T_jn, whether it refers to a timelag or
distance. Q_r is the credibility level of the r-th rule.
Its value is between 0 and 1, and depends on two factors:

• how well the morph in question fits the datapoints
over its domain, and

• how many and how similar the rules were that have been
pooled to form the rule at hand.

When an estimate of an HV value is desired at a certain
value of the lag variable, the user has to provide in its
vicinity a sequence of values of all available OVs that are
assumed to be causally related to the HV. These sequences
are then submitted to the morph-fitting program (MFP). The
system then looks in the knowledge base for the N best
estimates (N specified by the user) coming from rules that

• connect the HV sought and the available OVs;

• refer to the same type of morph as the newly fitted
one;

• involve morph parameters and lag values that are
"similar enough" to those in the query, i.e. that are within
the user-specified range of pooling rules.

The so-called confidence level of the estimate, CQ,
depends  on  the  credibility  level  of  the  rule  used  as  well  as 
how  well  the  new  morph  fits  its  datapoints  and  how  close  its 
parameters  are  to  those  of  the  morph  matched  in  the 
knowledge  base. 

Let  us  now  assume  that  the  estimation  is  performed  and 
up to N values of the HV are returned for each lag value
that yields such possibility. The system will calculate the
average of the N estimates weighted by their confidence
level. This process thus provides datapoints, each
specifying weighted average HV vs. lag value, over the whole
range of interest. The system then finally invokes the MFP
to produce the functional form desired. Its validity is
based on the assumption that the OVs, whose morphs were
used for the estimation, have obeyed the same laws when the
observations were made for the knowledge base as when they
were measured for the estimation. Furthermore, the
relations between and within the groups of OVs and HVs
are,  statistically  speaking,  constant  over  time. 
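
The confidence-weighted averaging step can be written out as a short sketch. It assumes that the matching rules have already produced candidate estimates, each with a confidence level, for one lag value. The names and numbers are hypothetical.

```python
from typing import List, NamedTuple


class Estimate(NamedTuple):
    hv_value: float     # hidden variable value proposed by one rule
    confidence: float   # confidence level of that estimate, between 0 and 1


def pooled_estimate(candidates: List[Estimate], n_best: int) -> float:
    """Confidence-weighted average of the n_best most confident estimates."""
    best = sorted(candidates, key=lambda e: e.confidence, reverse=True)[:n_best]
    total = sum(e.confidence for e in best)
    if total == 0:
        raise ValueError("no estimate with non-zero confidence")
    return sum(e.hv_value * e.confidence for e in best) / total


candidates = [Estimate(10.0, 0.5), Estimate(20.0, 0.25), Estimate(100.0, 0.05)]
top_two = pooled_estimate(candidates, n_best=2)      # (5 + 5) / 0.75
all_three = pooled_estimate(candidates, n_best=3)    # (5 + 5 + 5) / 0.8
```

Repeating this for each lag value of interest yields the datapoints (weighted average HV vs. lag) to which a functional form is then fitted.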

The  system  employs  the  following  modules: 

8.1 The GPR-1 fits a minimal set of basic patterns,
morphs,  to  a  sequence  of  open  variable  datapoints. 

8.2 The GPR-2 establishes rules between sets of
parametric values of morphs describing open variable
behavior and individual values of hidden variables.

8.3 The GPR-3 pools rules that connect the same open
variable  and  hidden  variable  and  satisfy  certain  statistical 
and  rule-generation  criteria.  The  number  and  credibility 
of  rules  increase  with  experience. 

8.4 The GPR-4 estimates the values of hidden variables
at  desired  time  points. 

8.5 The GPR-5 extends the system to distributed
processing and intelligence. It merges source files and
knowledge bases, established at different observation points
by satellite computers, if certain statistical and
file-generation criteria are satisfied, as verified by the
system automatically.

8.6  The  GPR-6  extends  the  system’s  capabilities  to 
estimating  the  functional  form  of  hidden  variable 
distributions  rather  than  estimating  only  the  individual 
values  of  hidden  variables. 
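
The morph fitting of module 8.1 can be illustrated for the two morph types named earlier, a trend and a step function. The sketch assumes equally spaced datapoints, a straight line as the only trend shape, and the sum of squared errors as the measure of fit; the actual set of morphs and fitting criteria may differ. All names are hypothetical.

```python
from typing import List, Tuple


def fit_trend(ys: List[float]) -> Tuple[float, float, float]:
    """Least-squares line over x = 0..n-1: (slope, intercept, squared error)."""
    n = len(ys)
    mean_x = (n - 1) / 2.0
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys)) / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in enumerate(ys))
    return slope, intercept, sse


def fit_step(ys: List[float]) -> Tuple[int, float, float]:
    """Best single step: (position, height of jump, squared error)."""
    best = None
    for k in range(1, len(ys)):
        left, right = ys[:k], ys[k:]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        sse = (sum((y - mean_left) ** 2 for y in left)
               + sum((y - mean_right) ** 2 for y in right))
        if best is None or sse < best[2]:
            best = (k, mean_right - mean_left, sse)
    return best


def fit_morph(ys: List[float]) -> str:
    """Name of the morph type that fits the datapoints more closely."""
    return "step" if fit_step(ys)[2] < fit_trend(ys)[2] else "trend"


rising = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
jump = [0.0, 0.0, 0.0, 5.0, 5.0, 5.0]
```

The fitted parameters (slope and intercept, or position and height) are the kind of parametric values that module 8.2 would connect to hidden variable values.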


9. AIR TRAFFIC CONTROL (ATC) AS A TASK ENVIRONMENT FOR THE
THEORY OF STRATEGIES

Students of all emerging disciplines soon feel the need
to employ their newly developed tools on some real-life
problems. The recent shift toward applicable research in
Artificial Intelligence clearly indicates that this area of
study has matured sufficiently. Production systems
incorporate extensive bases of expert knowledge in a variety
of different domains. Event-driven process models can
simulate  realistic,  large  subsets  of  the  real  world. 
Problem-solving  techniques  have  become  powerful  enough  to 
control  complex  robot  behavior.  The  ATC  environment  seems 
to  have  the  following  important  qualifications  for  being 
studied  within  the  technical  and  conceptual  framework  of  the 
Theory  of  Strategies: 

• the task is complex enough to be challenging;

• one can identify problem areas of different sizes that
could be attacked successively;

• one can define plausible metrics along several
dimensions to measure the performance of a proposed system;

• until a subsystem is fully developed and tested, it
can operate in a realistic, simulated world;

• a successful system for automatic ATC would share with
systems working in other environments the important
capabilities of planning, problem-solving and
decision-making under uncertainty and risk in dynamically
changing domains while satisfying a hierarchy of
constraints.

Interest in automating the ATC task has increased over
the past few years [19-25]. The need for radical
modernization of the current mode of operation, as shown by
the number of near-misses mostly due to errors in human
judgement,  has  been  made  more  critical  by  the  recent 
controversy  between  the  Federal  Government  and  PATCO. 




In the following, we intend to discuss the above issues
briefly, outline an "ideal" ATC system, and show how our
present work could contribute to the development of
automated ATC.

When an aircraft flies from one airport to another
under instrument flight rules (as military and civilian
planes do), it passes through the jurisdiction of a series
of ATC centers. These centers track each flight within
their sector on radar and try to keep it on its appointed
path, according to a desired time schedule. The control
actions must also satisfy a number of constraints. Some are
constant, such as the government-prescribed rules for
minimum separation and the physical limitations of aircraft
capabilities. Others arise from the situation, such as
unfavorable  weather  conditions  and  emergency  landing 
priorities.  In  addition  to  safe  and  timely  take-offs, 
flights  and  landings,  fuel  economy  and  noise  pollution  over 
inhabited  areas  must  also  be  considered. 
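
A constant constraint such as minimum separation lends itself to a simple check. The sketch below assumes flat, two-dimensional positions and treats a pair of aircraft as in violation only when it is too close both horizontally and vertically. The default thresholds are placeholders for illustration, not official separation minima, and all names are hypothetical.

```python
import math
from itertools import combinations
from typing import List, NamedTuple, Tuple


class Aircraft(NamedTuple):
    callsign: str
    x_nm: float          # east-west position, nautical miles
    y_nm: float          # north-south position, nautical miles
    altitude_ft: float


def separation_violations(flights: List[Aircraft],
                          min_horizontal_nm: float = 5.0,
                          min_vertical_ft: float = 1000.0) -> List[Tuple[str, str]]:
    """Pairs that are too close both horizontally and vertically."""
    violations = []
    for a, b in combinations(flights, 2):
        horizontal = math.hypot(a.x_nm - b.x_nm, a.y_nm - b.y_nm)
        vertical = abs(a.altitude_ft - b.altitude_ft)
        if horizontal < min_horizontal_nm and vertical < min_vertical_ft:
            violations.append((a.callsign, b.callsign))
    return violations


flights = [
    Aircraft("A", 0.0, 0.0, 10000.0),
    Aircraft("B", 3.0, 0.0, 10500.0),   # close to A in both dimensions
    Aircraft("C", 3.0, 0.0, 12000.0),   # directly above B, but well separated
]
conflicts = separation_violations(flights)
```

The number of pairs returned is one possible count of constraint violations for use as a performance measure.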

The above microcosmos is well-structured in terms of
state changes over space and time. The commands and pilot
actions are drawn from a small standardized set. The
measures of aircraft performance are simple, such as flight
time, fuel consumed, and number and degree of constraint
violations. Systems competence can be measured along the
dimensions of the number and the duration of validity of
commands, and (assuming perfect adherence to the commands)
all the measures of aircraft performance.

Sources of uncertainty include imperfect adherence
to commands, fuzziness in location of aircraft on radar
images, suboptimal commands issued, and unexpected environmental
events (weather, incoming aircraft, etc.).

Figure  1  shows  an  idealised  arrangement  for  an 
automated  ATC  system.  The  strategy,  based  on  plans,  is 
tested and verified in the Simulated World. The
consequences  of  the  actions  suggested  are  fed  back  to  the 
Decision-Making  unit  and,  if  the  results  are  unsatisfactory, 
the  actions  are  modified  as  long  as  necessary. 

The  actions  thus  proven  are  then  communicated  by  a 
human controller as a command to the Real World. Finally,
the status of the Real World updates that of the Simulated
World at regular intervals or more often in critical
situations.
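
The feedback loop between the Decision-Making unit and the Simulated World can be sketched as a propose, simulate, revise cycle. The sketch assumes that the four steps are supplied as functions and that the loop gives up after a fixed number of rounds. The toy model, in which a heading change of so many degrees yields a miss distance of as many miles, is purely illustrative and hypothetical.

```python
from typing import Callable, Optional


def verify_in_simulation(propose: Callable[[], float],
                         simulate: Callable[[float], float],
                         acceptable: Callable[[float], bool],
                         revise: Callable[[float, float], float],
                         max_rounds: int = 10) -> Optional[float]:
    """Return the first action whose simulated outcome is acceptable."""
    action = propose()
    for _ in range(max_rounds):
        outcome = simulate(action)
        if acceptable(outcome):
            return action                  # proven action, ready to be issued
        action = revise(action, outcome)   # feedback modifies the action
    return None                            # nothing acceptable found


# Toy model: the action is a heading change in degrees and the simulated
# outcome is the resulting miss distance in nautical miles.
command = verify_in_simulation(
    propose=lambda: 0.0,
    simulate=lambda heading_change: 1.0 * heading_change,
    acceptable=lambda miss_distance: miss_distance >= 5.0,
    revise=lambda heading_change, miss_distance: heading_change + 2.0,
)
```

Only the action returned by the loop would be passed on to the human controller; a result of None signals that the unit found no acceptable action within the allotted rounds.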


FIGURE 1 ABOUT HERE


We shall show how our present work can contribute to
the ATC task. Assume that ATC trainees specify their
control strategy in terms of principles and high-level
examples to the Advice Taker/Inquirer (AT/I) system. The
latter is linked up with the Simulated World in which it
tests, verifies and evaluates the consequences of the
strategy so imparted. This seems, from the educational point
of view, to be a very useful feedback loop that
involves the ATC trainee, the AT/I system and the Simulated
World.

The Quasi-Optimizer (QO) system would automatically
generate  a  computer  model,  a  descriptive  theory,  of  the 
trainees'  strategies.  Finally,  it  would  create  a  normative 
theory, quasi-optimum for reasons described before, out of
the  descriptive  theories. 

In view of the well-defined boundaries of this
problem-solving  universe  and  of  the  limited  set  of  distinct 
situations  and  actions,  it  is  likely  that  our  theoretical 
efforts  can  be  employed  for  this  important,  practical 
domain.

10. FINAL COMMENTS

We have introduced a Theory of Strategies in terms of
its objectives and some possible techniques and
methodologies. We have shown approaches to automatic
analysis and synthesis of strategies. We have introduced
the term Digital Gaming, which involves model-building,
simulation and machine learning ideas. Digital Gaming,
augmented with the tools of operations research, decision
theory and utility theory, would provide a computational
environment to automate important aspects of strategic
planning.